[QNN EP] Support quantized BatchNorm with per-channel DQ params on QNN HTP #26959
Conversation
AI review summary (Correctness, Performance, Conclusion): This is a necessary fix for broad support of quantized models on Qualcomm hardware. The implementation includes the necessary builder logic and verification tests. LGTM.
Pull request overview
This PR adds support for quantized BatchNormalization with per-channel DequantizeLinear parameters on the QNN HTP backend, which is a common pattern in quantized models from quantization tools.
Changes:
- Refactored BatchNorm parameter preprocessing to support per-channel quantization through a new `MaybeDequantizeParamTensor` helper
- Added support for the UFIXED_POINT_16 and SFIXED_POINT_16 data types in BatchNorm operations
- Updated QDQ node group selector to accept 3-5 DQ nodes (previously required exactly 3)
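As a rough illustration of what a helper like `MaybeDequantizeParamTensor` has to handle (the name comes from this PR, but the sketch below is a hypothetical standalone version in Python, not the actual C++ implementation): per-channel dequantization applies a separate scale and zero point to each channel of a 1-D parameter tensor, rather than one pair for the whole tensor.

```python
# Hypothetical sketch of per-channel dequantization for a 1-D BatchNorm
# parameter tensor (scale, mean, or var), one (scale, zero_point) pair
# per channel. Illustrative only; not the QNN EP implementation.

def dequantize_per_channel(quantized, scales, zero_points):
    """Dequantize channel-by-channel: real[i] = (q[i] - zp[i]) * scale[i]."""
    assert len(quantized) == len(scales) == len(zero_points)
    return [(q - zp) * s for q, s, zp in zip(quantized, scales, zero_points)]

# Example: a 3-channel INT8 parameter with distinct per-channel scales.
q_mean = [10, -20, 30]
scales = [0.5, 0.25, 0.1]
zps = [0, 4, -2]
print(dequantize_per_channel(q_mean, scales, zps))
```

With per-tensor quantization all channels would share one scale; the per-channel form is what quantization tools commonly emit for BatchNorm parameters.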
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| onnxruntime/test/providers/qnn/batch_norm_test.cc | Added test case for BatchNorm with per-channel quantized parameters (scale, mean, var) |
| onnxruntime/core/providers/qnn/builder/opbuilder/batch_norm_op_builder.cc | Implemented per-channel dequantization support, added 16-bit datatype support, and refactored parameter resolution logic |
| onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selectors.cc | Updated selector to allow variable number of DQ nodes (3-5) for BatchNorm inputs |
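The selector relaxation in `qdq_selectors.cc` can be pictured with a toy predicate (a sketch of the count check only, not the actual C++ selector): BatchNormalization takes up to five inputs (X, scale, B, mean, var), so a QDQ node group may now present anywhere from 3 to 5 DequantizeLinear producers instead of exactly 3.

```python
# Toy sketch of the relaxed DQ-count rule for BatchNorm QDQ node groups.
# The real logic lives in qdq_selectors.cc; this mirrors only the count check.

MIN_DQ_INPUTS = 3  # previously the selector required exactly this many
MAX_DQ_INPUTS = 5  # X, scale, B, mean, var all fed by DequantizeLinear

def batch_norm_group_accepted(num_dq_inputs: int) -> bool:
    return MIN_DQ_INPUTS <= num_dq_inputs <= MAX_DQ_INPUTS

print([batch_norm_group_accepted(n) for n in range(1, 7)])
# n = 3, 4, and 5 are accepted; before this PR only n == 3 was.
```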
adrianlizarraga left a comment: "couple questions"
Force-pushed from b6f72a0 to 86f3192
Force-pushed from 86f3192 to 3b58106
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

Azure Pipelines successfully started running 4 pipeline(s).
yuslepukhin left a comment.
/azp run MacOS CI Pipeline
Motivation:
QNN HTP was rejecting quantized BatchNorm models whose parameters (scale, mean, var) arrive through DequantizeLinear nodes with per-channel INT8 quantization, a pattern commonly produced by quantization tools.
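To make the rejected pattern concrete, here is a hypothetical end-to-end sketch in plain Python (names and values are illustrative, not onnxruntime code): each parameter arrives as an INT8 tensor with its own per-channel scale, is dequantized, and then feeds the standard BatchNorm formula y = gamma * (x - mean) / sqrt(var + eps) + beta.

```python
import math

# Hypothetical sketch of the QDQ pattern QNN HTP previously rejected:
# BatchNorm params stored as per-channel INT8 behind DequantizeLinear.
# Illustrative values; zero points are 0 for brevity.

def dq(q, scale, zp=0):
    # DequantizeLinear for a single element: real = (q - zp) * scale
    return (q - zp) * scale

eps = 1e-5
# Per-channel quantized params for a 2-channel input.
gamma = [dq(64, 0.01), dq(100, 0.02)]   # scale param
beta  = [0.0, 0.0]                      # bias kept in float here
mean  = [dq(10, 0.1), dq(-5, 0.2)]
var   = [dq(50, 0.02), dq(25, 0.04)]

x = [1.5, -0.5]  # one value per channel
y = [gamma[c] * (x[c] - mean[c]) / math.sqrt(var[c] + eps) + beta[c]
     for c in range(2)]
print(y)
```

Folding the DequantizeLinear nodes into the BatchNorm op, as this PR enables, lets the backend consume the parameters directly instead of rejecting the graph.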
Changes: